Multilingual Entity Linking: Comparing English and Spanish

Authors

Henry Rosales-Méndez

Barbara Poblete

Aidan Hogan

Abstract

The Entity Linking (EL) task is concerned with linking entity mentions in a text collection with their corresponding knowledge base entries. The majority of approaches have focused on EL over English text collections. However, some works propose language-independent or multilingual approaches to perform EL over texts in many languages. In this paper, our goal is to see how well EL systems perform outside of the primary language (often English). We first provide a survey of EL approaches that present evaluation over multiple languages. We then provide results of an initial study comparing selected entity linking APIs for equivalent documents and sentences in English and Spanish. Multilingual EL approaches fare best for Spanish, though all approaches still perform better for English text than for the corresponding Spanish text. This indicates that there is an important gap between EL techniques for English and for Spanish (and possibly for many other languages) that has not yet been addressed. We leave investigation of the causes of this gap for future work; it could be due to many factors, for example, differences in existing multilingual knowledge bases.

Similar resources

Multilingual Event Detection using the NewsReader Pipelines

We describe a novel modular system for cross-lingual event extraction for English, Spanish, Dutch and Italian texts. The system consists of a ready-to-use modular set of advanced multilingual Natural Language Processing (NLP) tools. The pipeline integrates modules for basic NLP processing as well as more advanced tasks such as cross-lingual Named Entity Linking, Semantic Role Labeling and time...

Cross-lingual Wikification Using Multilingual Embeddings

Cross-lingual Wikification is the task of grounding mentions written in non-English documents to entries in the English Wikipedia. This task involves the problem of comparing textual clues across languages, which requires developing a notion of similarity between text snippets across languages. In this paper, we address this problem by jointly training multilingual embeddings for words and Wiki...

RPI BLENDER TAC-KBP2016 System Description

We used the Stanford CoreNLP toolkit (Manning et al., 2014b) for English name tagging. To extract name mentions from Chinese and Spanish documents, we use bi-directional LSTM (Long Short-Term Memory) networks, which can leverage long-distance features. The inputs to the networks are pretrained word embeddings and randomly generalized character embeddings. Both word embedding and character embeddings...

NameTag(TM) Japanese and Spanish Systems as Used for MET

We participated in the Multilingual Entity Task (MET) for Japanese and Spanish using SRA's multilingual text-indexing software, NameTag(TM). Its English version was used for the Named Entity Task (NE) in MUC-6 [2]. The NameTag Japanese and Spanish systems were customized to accommodate the MET-specific requirements and achieved high performance in both recall and precision.

UNIBA: Combining Distributional Semantic Models and Sense Distribution for Multilingual All-Words Sense Disambiguation and Entity Linking

This paper describes the participation of the UNIBA team in Task 13 of SemEval-2015 on Multilingual All-Words Sense Disambiguation and Entity Linking. We propose an algorithm able to disambiguate both word senses and named entities by combining the simple Lesk approach with information coming from both a distributional semantic model and usage frequency of meanings. The results for both ...

Publication date: 2017
